Introduction to TCP/IP 
2 Feb 1995 
Summary: TCP and IP were developed by a Department of Defense (DOD) research 
project to connect a number of different networks designed by different vendors 
into a network of networks (the "Internet"). It was initially successful because 
it delivered a few basic services that everyone needs (file transfer, electronic 
mail, remote logon) across a very large number of client and server systems. 
Several computers in a small department can use TCP/IP (along with other 
protocols) on a single LAN. The IP component provides routing from the 
department to the enterprise network, then to regional networks, and finally to 
the global Internet. On the battlefield a communications network will sustain 
damage, so the DOD designed TCP/IP to be robust and automatically recover from 
any node or phone line failure. This design allows the construction of very 
large networks with less central management. However, because of the automatic 
recovery, network problems can go undiagnosed and uncorrected for long periods 
of time. 
As with all other communications protocols, TCP/IP is composed of layers: 
  IP - is responsible for moving packets of data from node to node. IP forwards 
  each packet based on a four byte destination address (the IP number). The 
  Internet authorities assign ranges of numbers to different organizations. The 
  organizations assign groups of their numbers to departments. IP operates on 
  gateway machines that move data from department to organization to region and 
  then around the world. 
  TCP - is responsible for verifying the correct delivery of data from client to 
  server. Data can be lost in the intermediate network. TCP adds support to 
  detect errors or lost data and to trigger retransmission until the data is 
  correctly and completely received. 
  Sockets - is a name given to the package of subroutines that provide access to 
  TCP/IP on most systems. 
Network of Lowest Bidders 
The Army puts out a bid on a computer and DEC wins the bid. The Air Force puts 
out a bid and IBM wins. The Navy bid is won by Unisys. Then the President 
decides to invade Grenada and the armed forces discover that their computers 
cannot talk to each other. The DOD must build a "network" out of systems each of 
which, by law, was delivered by the lowest bidder on a single contract. 
 
The Internet Protocol was developed to create a Network of Networks (the 
"Internet"). Individual machines are first connected to a LAN (Ethernet or Token 
Ring). TCP/IP shares the LAN with other uses (a Novell file server, Windows for 
Workgroups peer systems). One device provides the TCP/IP connection between the 
LAN and the rest of the world. 
To ensure that all types of systems from all vendors can communicate, TCP/IP is 
absolutely standardized on the LAN. However, larger networks based on long 
distances and phone lines are more volatile. In the US, many large corporations 
wish to reuse large internal networks based on IBM's SNA. In Europe, the 
national phone companies traditionally standardize on X.25. However, the sudden 
explosion of high speed microprocessors, fiber optics, and digital phone systems 
has created a burst of new options: ISDN, frame relay, FDDI, Asynchronous 
Transfer Mode (ATM). New technologies arise and become obsolete within a few 
years. With cable TV and phone companies competing to build the National 
Information Superhighway, no single standard can govern citywide, nationwide, or 
worldwide communications. 
The original design of TCP/IP as a Network of Networks fits nicely within the 
current technological uncertainty. TCP/IP data can be sent across a LAN, or it 
can be carried within an internal corporate SNA network, or it can piggyback on 
the cable TV service. Furthermore, machines connected to any of these networks 
can communicate with machines on any other network through gateways supplied by 
the network vendor. 
Addresses 
Each technology has its own convention for transmitting messages between two 
machines within the same network. On a LAN, messages are sent between machines 
by supplying the six byte unique identifier (the "MAC" address). In an SNA 
network, every machine has Logical Units with their own network address. DECNET, 
Appletalk, and Novell IPX all have a scheme for assigning numbers to each local 
network and to each workstation attached to the network. 
On top of these local or vendor specific network addresses, TCP/IP assigns a 
unique number to every workstation in the world. This "IP number" is a four byte 
value that, by convention, is expressed by converting each byte into a decimal 
number (0 to 255) and separating the bytes with a period. For example, the PC 
Lube and Tune server is 130.132.59.234. 
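The dotted notation is only a convenient way to write a 32-bit number. A minimal Python sketch of the conversion in both directions follows. The function names are illustrative and are not part of any standard library.

```python
def dotted_to_int(address: str) -> int:
    """Pack a dotted-decimal IP number into a single 32-bit integer."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a four byte IP number: {address!r}")
    value = 0
    for byte in parts:
        # Shift the bytes collected so far and append the next one.
        value = (value << 8) | byte
    return value


def int_to_dotted(value: int) -> str:
    """Unpack a 32-bit integer into dotted-decimal form, high byte first."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


server = dotted_to_int("130.132.59.234")
print(hex(server))            # the same four bytes written in hexadecimal
print(int_to_dotted(server))  # back to dotted decimal
```

Each byte of the address occupies eight bits of the integer, which is why every decimal field is limited to the range 0 to 255.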
An organization begins by sending electronic mail to Hostmaster@INTERNIC.NET 
requesting assignment of a network number. It is still possible for almost 
anyone to get assignment of a number for a small "Class C" network in which the 
first three bytes identify the network and the last byte identifies the 
individual computer. The author followed this procedure and was assigned the 
numbers 192.35.91.* for a network of computers at his house. Larger 
organizations can get a "Class B" network where the first two bytes identify the 
network and the last two bytes identify each of up to 64 thousand individual 
workstations. Yale's Class B network is 130.132, so all computers with IP 
address 130.132.*.* are connected through Yale. 
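The class of a network can be read from the first byte of the address. The sketch below assumes the classful ranges in use when this was written: a first byte below 128 for Class A, 128 to 191 for Class B, and 192 to 223 for Class C. The Class A case and the function name are additions for illustration and are not discussed in the text above.

```python
def split_classful(address: str):
    """Return (class letter, network part, host part) for a classful address."""
    octets = address.split(".")
    if len(octets) != 4:
        raise ValueError(f"not a four byte IP number: {address!r}")
    first = int(octets[0])
    if first < 128:
        network_bytes = 1   # Class A: one network byte, three host bytes
    elif first < 192:
        network_bytes = 2   # Class B: two network bytes, two host bytes
    elif first < 224:
        network_bytes = 3   # Class C: three network bytes, one host byte
    else:
        raise ValueError("not a Class A, B, or C address")
    letter = "ABC"[network_bytes - 1]
    network = ".".join(octets[:network_bytes])
    host = ".".join(octets[network_bytes:])
    return letter, network, host


def host_numbers(network_bytes: int) -> int:
    """Count the distinct values the remaining host bytes can hold."""
    return 2 ** (8 * (4 - network_bytes))


print(split_classful("130.132.59.234"))  # a Yale address
print(split_classful("192.35.91.7"))     # an address on the author's network
print(host_numbers(2))                   # host values in a Class B network
```

Two host bytes give 65,536 possible values, which is the "64 thousand" workstations mentioned above.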
The organization then connects to the Internet through one of a dozen regional 
or specialized network suppliers. The network vendor is given the subscriber 
network number and adds it to the routing configuration in its own machines and 
those of the other major network suppliers. 
There is no mathematical formula that translates the numbers 192.35.91 or 
130.132 into "Yale University" or "New Haven, CT." The machines that manage 
large regional networks or the central Internet routers managed by the National 
Science Foundation can only locate these networks by looking each network number 
up in a table. There are potentially thousands of Class B networks, and millions 
of Class C networks, but computer memory costs are low, so the tables are 
reasonable. Customers that connect to the Internet, even customers as large as 
IBM, do not need to maintain any information on other networks. They send all 
external data to the regional carrier to which they subscribe, and the regional 
carrier maintains the tables and does the appropriate routing. 
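The difference between a carrier and a customer can be sketched as two lookup functions. This is a simplified illustration, not how any real router stores its tables. The table contents, the link descriptions, and the function names are hypothetical; only the network numbers come from the text.

```python
# Hypothetical carrier table: network number -> where to send the traffic.
CARRIER_TABLE = {
    "130.132": "New England regional network",
    "192.35.91": "link to a subscriber's home network",
}


def carrier_lookup(address, table=CARRIER_TABLE):
    """Look the network number up in the table, or return None if unknown."""
    octets = address.split(".")
    # Try the three byte (Class C) key first, then two bytes, then one.
    for size in (3, 2, 1):
        key = ".".join(octets[:size])
        if key in table:
            return table[key]
    return None


def customer_next_hop(address, own_network, carrier="regional carrier"):
    """A customer keeps no table: anything external goes to the carrier."""
    if address.startswith(own_network + "."):
        return "deliver internally"
    return carrier


print(carrier_lookup("130.132.59.234"))
print(customer_next_hop("192.35.91.7", own_network="130.132"))
```

The carrier needs one entry per network number, while the customer needs only its own network number and the name of its carrier.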
New Haven is in a border state, split 50-50 between the Yankees and the Red Sox. 
In this spirit, Yale recently switched its connection from the Middle Atlantic 
regional network to the New England carrier. When the switch occurred, tables in 
the other regional areas and in the national spine had to be updated, so that 
traffic for 130.132 was routed through Boston instead of New Jersey. The large 
network carriers handle the paperwork and can perform such a switch given 
sufficient notice. During a conversion period, the university was connected to 
both networks so that messages could arrive through either path. 
Subnets 
Although the individual subscribers do not need to tabulate network numbers or 
provide explicit routing, it is convenient for most Class B networks to be 
internally managed as a much smaller and simpler version of the larger network 
organizations. It is common to subdivide the two bytes available for internal 
assignment into a one byte department number and a one byte workstation ID. 
 
The enterprise network is built using commercially available TCP/IP router 
boxes. Each router has small tables with 255 entries to translate the one byte 
department number into selection of a destination Ethernet connected to one of 
the routers. Messages to the PC Lube and Tune server (130.132.59.234) are sent 
through the national and New England regional networks based on the 130.132 part 
of the number. Arriving at Yale, the 59 department ID selects an Ethernet 
connector in the C&IS building. The 234 selects a particular workstation on 
that LAN. The Yale network must be updated as new Ethernets and departments are 
added, but it is not affected by changes outside the university or the movement 
of machines within the department. 
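The internal routing step can be sketched as a split of the address into its department and workstation bytes. The table below holds a single entry taken from the example in the text; the table layout and the function name are assumptions made for illustration.

```python
# Hypothetical department table held by a router inside the university.
DEPARTMENT_LANS = {59: "Ethernet in the C&IS building"}


def route_inside_yale(address, lans=DEPARTMENT_LANS):
    """Return (LAN, workstation ID) for an address on the 130.132 network."""
    net_hi, net_lo, department, station = (int(p) for p in address.split("."))
    if (net_hi, net_lo) != (130, 132):
        raise ValueError("not an address on the 130.132 network")
    lan = lans.get(department)
    if lan is None:
        raise LookupError(f"no Ethernet configured for department {department}")
    # The last byte is only meaningful once the packet is on the right LAN.
    return lan, station


print(route_inside_yale("130.132.59.234"))
```

Only the department table changes when a new Ethernet is added. Moving a machine within a department changes nothing in the router.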
An Uncertain Path 
Every time a message arrives at an IP router, it makes an individual decision 
about where to send it next. There is no concept of a session with a preselected 
path for all traffic. Consider a company with facilities in New York, Los 
Angeles, Chicago and Atlanta. It could build a network from four phone lines 
forming a loop (NY to Chicago to LA to Atlanta to NY). A message arriving at the 
NY router could go to LA via either Chicago or Atlanta. The reply could come 
back the other way. 
How does the router make a decision between routes? There is no correct answer. 
Traffic could be routed by the "clockwise" algorithm (go NY to Atlanta, LA to 
Chicago). The routers could alternate, sending one message to Atlanta and the 
next to Chicago. More sophisticated routing measures traffic patterns and sends 
data through the least busy link. 
If one phone line in this network breaks down, traffic can still reach its 
destination through a roundabout path. After losing the NY to Chicago line, data 
can be sent NY to Atlanta to LA to Chicago. This provides continued service 
though with degraded performance. This kind of recovery is the primary design 
feature of IP. The loss of the line is immediately detected by the routers in NY 
and Chicago, but somehow this information must be sent to the other nodes. 
Otherwise, LA could continue to send NY messages through Chicago, where they 
arrive at a "dead end." Each network adopts some Router Protocol which 
periodically updates the routing tables throughout the network with information 
about changes in route status. 
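The four city loop and the effect of a broken line can be sketched with a simple search for the path with the fewest hops. This is only an illustration of rerouting. It is not a real Router Protocol, which must also distribute the news of the failure to every node. The function names are hypothetical.

```python
from collections import deque

# The four phone lines forming the loop described above.
LINKS = [("NY", "Chicago"), ("Chicago", "LA"), ("LA", "Atlanta"), ("Atlanta", "NY")]


def build_graph(links):
    """Turn a list of two-way lines into a neighbor table."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph


def shortest_path(links, source, destination):
    """Breadth-first search for the path with the fewest hops, or None."""
    graph = build_graph(links)
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(shortest_path(LINKS, "NY", "Chicago"))

# Simulate the loss of the NY to Chicago line.
surviving = [link for link in LINKS if set(link) != {"NY", "Chicago"}]
print(shortest_path(surviving, "NY", "Chicago"))
```

With all four lines up, NY reaches Chicago in one hop. With the direct line removed, the search finds the roundabout path through Atlanta and LA.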
If the size of the network grows, then the complexity of the routing updates 
will increase as will the cost of transmitting them. Building a single network 
that covers the entire US would be unreasonably complicated. Fortunately, the 
Internet is designed as a Network of Networks. This means that loops and 
redundancy are built into each regional carrier. The regional network handles 
its own problems and reroutes messages internally. Its Router Protocol updates 
the tables in its own routers, but no routing updates need to propagate from a 
regional carrier to the NSF spine or to the other regions (unless, of course, a 
subscriber switches permanently from one region to another). 
Undiagnosed Problems 
IBM designs its SNA networks to be centrally managed. If any error occurs, it is 
reported to the network authorities. By design, any error is a problem that 
should be corrected or repaired. IP networks, however, were designed to be 
robust. In battlefield conditions, the loss of a node or line is a normal 
circumstance. Casualties can be sorted out later on, but the network must stay 
up. So IP networks are robust. They automatically (and silently) reconfigure 
themselves when something goes wrong. If there is enough redundancy built into 
the system, then communication is maintained. 
In 1975 when SNA was designed, such redundancy would have been prohibitively 
expensive, or it might have been argued that only the Defense Department could 
afford it. Today, however, simple routers cost no more than a PC. Yet the TCP/IP 
design philosophy that "errors are normal and can be largely ignored" produces 
problems of its own. 
Data traffic is frequently organized around "hubs," much like airline traffic. 
One could imagine an IP router in Atlanta routing messages for smaller cities 
throughout the Southeast. The problem is that data arrives without a 
reservation. Airline companies experience the problem around major events, like 
the Super Bowl. Just before the game, everyone wants to fly into the city. After 
the game, everyone wants to fly out. Imbalance occurs on the network when 
something new gets advertised. Adam Curry announced the server at "mtv.com" and 
his regional carrier was swamped with traffic the next day. The problem is that 
messages come in from the entire world over high speed lines, but they go out to 
mtv.com over what was then a slow speed phone line. 
Occasionally a snow storm cancels flights and airports fill up with stranded 
passengers. Many go off to hotels in town. When data arrives at a congested 
router, there is no place to send the overflow. Excess packets are simply 
discarded. It becomes the responsibility of the sender to retry the data a few 
seconds later and to persist until it finally gets through. This recovery is 
provided by the TCP component of the Internet protocol. 
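The discard and retry behavior can be sketched with a router that has a fixed amount of buffer space. The class, the buffer size, and the retry loop are simplifications invented for this example; real senders wait on timers rather than retrying in lockstep.

```python
from collections import deque


class Router:
    """A router with room for a limited number of waiting packets."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.queue = deque()

    def receive(self, packet):
        """Accept the packet if there is room; otherwise discard it."""
        if len(self.queue) >= self.capacity:
            return False  # no place to send the overflow
        self.queue.append(packet)
        return True

    def drain(self):
        """Forward everything currently waiting and empty the buffer."""
        forwarded = list(self.queue)
        self.queue.clear()
        return forwarded


def send_with_retry(packets, router):
    """Offer packets in bursts, retrying the discarded ones until all arrive."""
    pending = list(packets)
    delivered = []
    rounds = 0
    while pending:
        rounds += 1
        # Anything the router refuses stays pending for the next round.
        pending = [p for p in pending if not router.receive(p)]
        delivered.extend(router.drain())
    return delivered, rounds


delivered, rounds = send_with_retry([1, 2, 3, 4, 5, 6], Router(capacity=3))
print(delivered, rounds)
```

The router never reports the discarded packets to anyone. The sender alone is responsible for noticing the loss and trying again.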
TCP was designed to recover from node or line failures where the network 
propagates routing table changes to all router nodes. Since the update takes 
some time, TCP is slow to initiate recovery. The TCP algorithms are not tuned to 
optimally handle packet loss due to traffic congestion. Instead, the traditional 
Internet response to traffic problems has been to increase the speed of lines 
and equipment in order to stay ahead of growth in demand. 
TCP treats the data as a stream of bytes. It logically assigns a sequence number 
to each byte. The TCP packet has a header that says, in effect, "This packet 
starts with byte 379642 and contains 200 bytes of data." The receiver can detect 
missing or incorrectly sequenced packets. TCP acknowledges data that has been 
received and retransmits data that has been lost. The TCP design means that 
error recovery is done end-to-end between the Client and Server machine. There 
is no formal standard for tracking problems in the middle of the network, though 
each network has adopted some ad hoc tools. 
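The byte numbering lets the receiver work out what is missing. The sketch below represents each packet as a pair of starting byte number and length, using the numbers from the example above. It is a simplification: it ignores the wraparound of real TCP sequence numbers, and the function name is hypothetical.

```python
def next_expected_byte(segments, initial):
    """Return the number of the first byte not yet received in order.

    segments is a list of (starting byte number, length) pairs in any order.
    """
    expected = initial
    for start, length in sorted(segments):
        if start > expected:
            break  # a gap: the bytes from expected up to start are missing
        # Overlapping or duplicate data never moves the mark backward.
        expected = max(expected, start + length)
    return expected


received = [(379642, 200), (380042, 200)]
print(next_expected_byte(received, initial=379642))  # stops at the gap

received.append((379842, 200))                       # the retransmission arrives
print(next_expected_byte(received, initial=379642))  # the stream is complete
```

In the first call the receiver can acknowledge only through byte 379841, because the packet starting at 379842 has not arrived. Once the sender retransmits it, the acknowledgment jumps past both packets.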
Need to Know 
There are three levels of TCP/IP knowledge. Those who administer a regional or 
national network must design a system of long distance phone lines, dedicated 
routing devices, and very large configuration files. They must know the IP 
numbers and physical locations of thousands of subscriber networks. They must 
also have a formal network monitoring strategy to detect problems and respond 
quickly. 
Each large company or university that subscribes to the Internet must have an 
intermediate level of network organization and expertise. A half dozen routers 
might be configured to connect several dozen departmental LANs in several 
buildings. All traffic outside the organization would typically be routed to a 
single connection to a regional network provider. 
However, the end user can install TCP/IP on a personal computer without any 
knowledge of either the corporate or regional network. Three pieces of 
information are required: 
  The IP address assigned to this personal computer 
  The part of the IP address (the subnet mask) that distinguishes other machines 
  on the same LAN (messages can be sent to them directly) from machines in other 
  departments or elsewhere in the world (which are sent to a router machine) 
  The IP address of the router machine that connects this LAN to the rest of the 
  world. 
In the case of the PCLT server, the IP address is 130.132.59.234. Since the 
first three bytes designate this department, a "subnet mask" is defined as 
255.255.255.0 (255 is the largest byte value and represents the number with all 
bits turned on). It is a Yale convention (which we recommend to everyone) that 
the router for each department have station number 1 within the department 
network. Thus the PCLT router is 130.132.59.1, and the PCLT server is 
configured with the values: 
  My IP address: 130.132.59.234 
  Subnet mask: 255.255.255.0 
  Default router: 130.132.59.1 
The subnet mask tells the server that any other machine with an IP address 
beginning 130.132.59.* is on the same department LAN, so messages are sent to it 
directly. Any IP address beginning with a different value is accessed indirectly 
by sending the message through the router at 130.132.59.1 (which is on the 
departmental LAN). 
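The decision the server makes for each outgoing message can be sketched as a bitwise AND of each address with the subnet mask. The configuration values are those listed above; the function names are illustrative.

```python
MY_ADDRESS = "130.132.59.234"
SUBNET_MASK = "255.255.255.0"
DEFAULT_ROUTER = "130.132.59.1"


def to_int(address):
    """Pack a dotted-decimal address into a 32-bit integer."""
    value = 0
    for part in address.split("."):
        value = (value << 8) | int(part)
    return value


def next_hop(destination, my_address=MY_ADDRESS, mask=SUBNET_MASK,
             router=DEFAULT_ROUTER):
    """Return the address the message should be handed to on the LAN."""
    same_lan = (to_int(destination) & to_int(mask)) == (to_int(my_address) & to_int(mask))
    # Same department LAN: send directly. Otherwise: send to the router.
    return destination if same_lan else router


print(next_hop("130.132.59.17"))  # another machine in the department
print(next_hop("130.132.1.5"))    # another Yale department
print(next_hop("192.35.91.7"))    # elsewhere in the world
```

The mask keeps the bits that name the department and discards the workstation byte, so two addresses are on the same LAN exactly when their masked values are equal.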
Additional information is available in self-study courses from SRA 
(1-800-SRA-1277) 
  TCP/IP [34610] 
Copyright 1995 PCLT -- Introduction to TCP/IP -- H. Gilbert 
